13 research outputs found

    A Study of Crowdsourcing for Collecting Machine Learning Datasets for Activity Recognition

    In this thesis, we propose novel methods to explore and improve crowdsourced data labeling for mobile activity recognition. The thesis is concerned with the quality (i.e., the performance of a classification model), quantity (i.e., the number of data samples collected), and motivation (i.e., the process that initiates and maintains goal-oriented behaviors) of participant contributions in mobile activity data collection studies. We focus on achieving high-quality and consistent ground-truth labeling and, in particular, on the impact of user feedback under different conditions. Although prior work has used several techniques to improve activity recognition performance, our approach differs in its end goals, proposed method, and implementation. Many researchers focus on the post-collection stage to increase activity recognition accuracy, for example by applying advanced machine learning algorithms to improve data quality or by exploring preprocessing techniques to increase data quantity. However, working with data after collection is difficult and time-consuming in most real-world situations because of dirty data. Unlike these common approaches, this thesis aims to motivate and sustain user engagement during the ongoing self-labeling task in order to optimize activity recognition accuracy. The outline of the thesis is as follows. Chapters 1 and 2 briefly introduce the thesis work and review the literature. Chapter 3 introduces novel gamified active learning and inaccuracy detection for crowdsourced data labeling in an activity recognition system (CrowdAct) using mobile sensing. We exploited active learning to address the lack of accurate information, integrated gamification into active learning to overcome the lack of motivation and sustained engagement, and introduced an inaccuracy detection algorithm to minimize inaccurate data.
    Chapter 4 introduces a novel method that exploits on-device deep learning inference, using a long short-term memory (LSTM)-based approach, to alleviate the effort of labeling and ground-truth data collection in activity recognition systems using smartphone sensors. The novel idea is that estimated activities are used as feedback to motivate users to collect accurate activity labels. Chapter 5 introduces a novel on-device personalization approach to data labeling for an activity recognition system using mobile sensing. The key idea behind this system is that estimated activities personalized for an individual user can be used as feedback to motivate user contribution and improve data labeling quality. We exploited fine-tuning of a deep recurrent neural network (RNN) to address the lack of sufficient training data and to minimize the need to train deep learning models on mobile devices from scratch. We used a model pruning technique to reduce the computation cost of on-device personalization without affecting accuracy. Finally, we built a robust activity data labeling system by integrating these two techniques, allowing the mobile application to create a personalized experience for the user. To demonstrate the capability and feasibility of the proposed methods in realistic settings, we developed the systems and deployed them in real-world settings such as crowdsourcing. For data labeling, we addressed online and self-labeling scenarios using inertial smartphone sensors, such as accelerometers. We recruited diverse participants and conducted the experiments in both a laboratory setting and a semi-natural setting. We also applied both manual labeling and semi-automated labeling assistance. Additionally, we gathered a large amount of labeled training data for activity recognition using smartphone sensors, along with other information such as user demographics and engagement. Chapter 6 offers a brief discussion of the thesis.
    Chapter 7 concludes the thesis and outlines future work. We empirically evaluated these methods against various study goals, using machine learning as well as descriptive and inferential statistics. Our results indicated that this study enabled us to collect crowdsourced activity data effectively. Our work revealed clear opportunities and challenges in combining human and mobile phone-based sensing techniques for researchers interested in studying human behavior in situ. Researchers and practitioners can apply our findings to improve recognition accuracy, reduce unreliable labels from human users, increase the total number of collected responses, and enhance participant motivation for activity data collection.
    Doctoral dissertation, Kyushu Institute of Technology. Diploma number: 工博甲第526号. Date of conferral: June 28, 2021 (Reiwa 3).
    Contents: 1 Introduction | 2 Related work | 3 Achieving High-Quality Crowdsourced Datasets in Mobile Activity Recognition | 4 On-Device Deep Learning Inference for Activity Data Collection | 5 On-Device Deep Personalization for Activity Data Collection | 6 Discussion | 7 Conclusion
    Kyushu Institute of Technology, 2021 (Reiwa 3)

    On-Device Deep Personalization for Robust Activity Data Collection

    One of the biggest challenges of activity data collection is the need to rely on users and keep them engaged so that they continually provide labels. Recent breakthroughs in mobile platforms have proven effective in bringing deep neural network-powered intelligence to mobile devices. This study proposes a novel on-device personalization approach to data labeling for an activity recognition system using mobile sensing. The key idea behind this system is that estimated activities personalized for an individual user can be used as feedback to motivate user contribution and improve data labeling quality. First, we exploited fine-tuning of a deep recurrent neural network to address the lack of sufficient training data and to minimize the need to train deep learning models on mobile devices from scratch. Second, we used a model pruning technique to reduce the computation cost of on-device personalization without affecting accuracy. Finally, we built a robust activity data labeling system by integrating these two techniques, allowing the mobile application to create a personalized experience for the user. To demonstrate the proposed model’s capability and feasibility, we developed the proposed system and deployed it in realistic settings. For our experimental setup, we gathered more than 16,800 activity windows from 12 activity classes using smartphone sensors. We empirically evaluated the quality of the proposed system by comparing it with a baseline using machine learning. Our results indicate that the proposed system effectively improved activity recognition accuracy for individual users and reduced the cost and latency of inference on mobile devices. Based on our findings, we highlight critical and promising future research directions regarding the design of efficient activity data collection with on-device personalization.
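
The abstract does not say which pruning technique was used. As a minimal sketch, assuming simple magnitude-based pruning (a common choice, not necessarily the authors' method), the smallest-magnitude weights of a layer are set to zero until a target sparsity is reached. The function name `magnitude_prune` and the example weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of a 2D weight matrix.

    Hypothetical helper: `weights` is a list of rows, `sparsity` is the
    fraction of weights to remove (0.0 keeps everything, 1.0 removes all).
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")

    # Sort all magnitudes to find the cut-off below which weights are dropped.
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(len(magnitudes) * sparsity)  # number of weights to zero out
    if k == 0:
        return [list(row) for row in weights]
    threshold = magnitudes[k - 1]

    pruned, removed = [], 0
    for row in weights:
        new_row = []
        for w in row:
            # Stop once k weights are removed so ties do not over-prune.
            if abs(w) <= threshold and removed < k:
                new_row.append(0.0)
                removed += 1
            else:
                new_row.append(w)
        pruned.append(new_row)
    return pruned


layer = [[0.5, -0.1, 0.03], [-0.8, 0.02, 0.3]]
print(magnitude_prune(layer, 0.5))
```

In practice a pruned model is usually fine-tuned again for a few steps so that the remaining weights compensate for the removed ones.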

    A Study of Crowdsourcing for Collecting Machine Learning Datasets for Activity Recognition

    Doctoral dissertation (summary), Kyushu Institute of Technology. Diploma number: 工博甲第526号. Date of conferral: June 28, 2021 (Reiwa 3).

    Automatic Labeled Dialogue Generation for Nursing Record Systems

    The integration of digital voice assistants in nursing residences is becoming increasingly important for improving nursing productivity in documentation. A key idea behind such a system is training natural language understanding (NLU) modules that enable the machine to classify the purpose of the user utterance (intent) and extract the valuable pieces of information present in the utterance (entities). One of the main obstacles to creating robust NLU is the lack of sufficient labeled data, which generally relies on human labeling. This process is costly and time-consuming, particularly in the high-level nursing care domain, which requires abstract knowledge. In this paper, we propose an automatic dialogue labeling framework for NLU tasks, specifically for nursing record systems. First, we apply data augmentation techniques to create a collection of variant sample utterances. The individual evaluation result strongly shows a stratification rate, with regard to both fluency and accuracy in utterances. We also investigate the possibility of applying deep generative models to our augmented dataset. The preliminary character-based long short-term memory (LSTM) model obtains an accuracy of 90% and generates a variety of reasonable texts with a BLEU score of 0.76. Second, we introduce an approach to intent and entity labeling that uses feature embeddings and semantic similarity-based clustering. We also empirically evaluate different embedding methods to find the representations best suited to our data and clustering tasks. Experimental results show that fastText embeddings perform strongly on both intent labeling and entity labeling, reaching 0.79 and 0.78 F1-scores and 0.67 and 0.61 silhouette scores, respectively.
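
The silhouette score reported above measures how well separated the clusters are: for each point, it compares the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b), as (b - a) / max(a, b). The following is a minimal sketch of that standard definition, assuming Euclidean distance and toy 2D points in place of real fastText embeddings; the function name and data are hypothetical, and the paper's actual distance metric is not stated in the abstract.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0

        # a: mean distance to the other members of the same cluster
        # (the distance from the point to itself is 0, so divide by n - 1).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)

        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        total += (b - a) / max(a, b)
    return total / len(points)


embeddings = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(silhouette_score(embeddings, [0, 0, 1, 1]))  # well-separated clusters
print(silhouette_score(embeddings, [0, 1, 0, 1]))  # poorly chosen clusters
```

Scores close to 1 indicate compact, well-separated clusters, while scores near or below 0 indicate overlapping or misassigned points.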

    On-Device Deep Learning Inference for Efficient Activity Data Collection

    Labeling activity data is a central part of the design and evaluation of human activity recognition systems. The performance of these systems depends greatly on the quantity and “quality” of annotations; relying on users and keeping them motivated to provide activity labels is therefore unavoidable. While mobile and embedded devices increasingly use deep learning models to infer user context, we propose to exploit on-device deep learning inference, using a long short-term memory (LSTM)-based method, to alleviate the effort of labeling and ground-truth data collection in activity recognition systems using smartphone sensors. The novel idea is that estimated activities are used as feedback to motivate users to collect accurate activity labels. For evaluation, we conduct experiments under two conditions. We compare the proposed method, which shows estimated activities obtained through on-device deep learning inference, with the traditional method, which shows sentences without estimated activities, both delivered through smartphone notifications. Evaluation on the gathered dataset shows that the proposed method improves both data quality (i.e., the performance of a classification model) and data quantity (i.e., the number of data samples collected), indicating that it can improve activity data collection and, in turn, enhance human activity recognition systems. We discuss the results, limitations, challenges, and implications of on-device deep learning inference in support of activity data collection. We also release the preliminary dataset to the activity recognition research community.
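
The two notification conditions compared above can be sketched as follows. This is an illustration only: it assumes the on-device model outputs a probability per activity class, and the function names, activity labels, and message wording are all hypothetical rather than taken from the study.

```python
def estimated_activity(probabilities):
    """Return the activity label with the highest predicted probability."""
    return max(probabilities, key=probabilities.get)


def build_notification(probabilities=None):
    """Compose the notification text for one of the two conditions.

    With `probabilities` (proposed condition): show the estimated activity.
    Without them (traditional condition): show a plain labeling reminder.
    The wording of both messages is a hypothetical placeholder.
    """
    if probabilities:
        label = estimated_activity(probabilities)
        return f"It looks like you are {label}. Please confirm or correct the label."
    return "Please label your current activity."


# Proposed condition: feedback includes the model's estimate.
print(build_notification({"walking": 0.7, "sitting": 0.2, "running": 0.1}))
# Traditional condition: no estimate is shown.
print(build_notification())
```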
